Introduction to PyTorch: Why Tensors Matter

PyTorch is a flexible, dynamic open-source framework that is popular for deep learning research and rapid development. At its core is the Tensor, its essential data structure. A Tensor is a multi-dimensional array designed to handle efficiently the numerical operations that deep learning models need, and it supports GPU acceleration once it is placed on a GPU device.

1. Understanding Tensor Structure

Every input, output, and model parameter in PyTorch is wrapped in a Tensor. Tensors work much like NumPy arrays but are optimized to run on specialized hardware such as GPUs, which makes them far more efficient for the large-scale linear algebra operations that neural networks require.
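The similarity to NumPy can be seen in a minimal sketch like the one below. The array values and variable names are illustrative assumptions; torch.from_numpy and Tensor.numpy are the standard conversion calls for CPU data.

```python
import numpy as np
import torch

# A small NumPy array with illustrative values.
array = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)

# Build a tensor from it; on the CPU the tensor shares memory with `array`.
tensor = torch.from_numpy(array)

# The same matrix product, written the same way in both libraries.
numpy_product = array @ array
torch_product = tensor @ tensor

# Convert the tensor result back to NumPy and compare.
same_result = np.allclose(numpy_product, torch_product.numpy())
print(same_result)
```

The code looks nearly identical in both libraries; the practical difference is that the tensor version can also be moved to a GPU and tracked by Autograd.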

The key attributes that define a Tensor:

Shape: Defines the dimensions of the data, expressed as a tuple (e.g., $4 \times 32 \times 32$ for a batch of images).
Data type: Defines the numeric type of the stored elements (e.g., torch.float32 for model weights, torch.int64 for indices).
Device: Indicates the physical hardware location, usually 'cpu' or 'cuda' (NVIDIA GPU).
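These three attributes can be inspected directly on any tensor. The snippet below is a minimal sketch: the tensor contents and the names images, indices, and device are illustrative, and it assumes the default dtype has not been changed from torch.float32.

```python
import torch

# A batch of 4 images of size 32x32, filled with zeros (illustrative data).
images = torch.zeros(4, 32, 32)

print(images.shape)   # torch.Size([4, 32, 32])
print(images.dtype)   # torch.float32
print(images.device)  # cpu

# Tensors built from Python integers default to torch.int64.
indices = torch.tensor([0, 2, 3])
print(indices.dtype)  # torch.int64

# Pick the GPU only if one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
images_on_device = images.to(device)
print(images_on_device.device)
```

Choosing the device through torch.cuda.is_available() keeps the same script runnable on machines with and without a GPU.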

Dynamic Graphs and Autograd

PyTorch uses an imperative execution model, meaning the computation graph is built as operations are executed. This lets the built-in automatic differentiation engine, Autograd, track every operation on a Tensor, provided requires_grad=True is set on it, so gradients can be computed easily during backpropagation.
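As a minimal sketch of this behavior, the example below tracks a simple function and asks Autograd for its gradient. The function $y = \sum_i x_i^2$ and the input values are illustrative choices, not taken from the lesson.

```python
import torch

# A leaf tensor that Autograd should track.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# The graph is recorded while these operations run.
y = (x ** 2).sum()  # y = x1^2 + x2^2 + x3^2

# Backward pass: fills x.grad with dy/dx = 2 * x.
y.backward()

print(x.grad)  # tensor([2., 4., 6.])
```

Because the graph is rebuilt on every forward pass, ordinary Python control flow (loops, conditionals) can change the computation from one iteration to the next.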

Question 1

Which command creates a $5 \times 5$ tensor containing random numbers following a uniform distribution between 0 and 1?

torch.rand(5, 5)

torch.random(5, 5)

torch.uniform(5, 5)

torch.randn(5, 5)

Question 2

If tensor $A$ is on the CPU, and tensor $B$ is on the CUDA device, what happens if you try to compute $A + B$?

An error occurs because operations require tensors on the same device.

PyTorch automatically moves $A$ to the CUDA device and proceeds.

The operation is performed on the CPU, and the result is returned to the CPU.

Question 3

What is the most common data type (dtype) used for model weights and intermediate calculations in Deep Learning?

torch.float32 (single-precision floating point)

torch.int64 (long integer)

torch.bool

torch.float64 (double-precision floating point)

Challenge: Tensor Manipulation and Shape

Prepare a tensor for a specific matrix operation.

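You have a feature vector $F$ of shape $(10,)$. You need to multiply it by a weight matrix $W$ of shape $(10, 5)$. PyTorch's matmul accepts a 1-D $F$ here and returns a result of shape $(5,)$, but to treat $F$ as an explicit row vector (a batch of one sample) and get a 2-dimensional result, $F$ must be 2-dimensional.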

Step 1

What should the shape of $F$ be before multiplication with $W$?

Solution:
The inner dimensions must match, so $F$ must be $(1, 10)$. Then $(1, 10) @ (10, 5) \rightarrow (1, 5)$.
Code: F_new = F.unsqueeze(0) or F_new = F.view(1, -1)
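Both options from the solution can be checked side by side. This is a small sketch in which the values of F are arbitrary placeholders:

```python
import torch

# Feature vector of shape (10,) with placeholder values 0..9.
F = torch.arange(10, dtype=torch.float32)

# Two equivalent ways to add a leading dimension of size 1.
F_unsqueezed = F.unsqueeze(0)  # insert a new axis at position 0
F_viewed = F.view(1, -1)       # -1 lets PyTorch infer the size (10)

print(F.shape)             # torch.Size([10])
print(F_unsqueezed.shape)  # torch.Size([1, 10])
print(F_viewed.shape)      # torch.Size([1, 10])

# Both results hold the same values.
same_values = torch.equal(F_unsqueezed, F_viewed)
print(same_values)         # True
```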

Step 2

Perform the matrix multiplication between $F_{new}$ and $W$ (shape $(10, 5)$).

Solution:
The operation is straightforward MatMul.
Code: output = F_new @ W or output = torch.matmul(F_new, W)
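Put together, Steps 1 and 2 might look like the sketch below. The random values and the fixed seed are assumptions made only so the example is reproducible.

```python
import torch

torch.manual_seed(0)  # fixed seed, for reproducibility only

F = torch.rand(10)     # feature vector, shape (10,)
W = torch.rand(10, 5)  # weight matrix, shape (10, 5)

# Step 1: make F a row vector of shape (1, 10).
F_new = F.unsqueeze(0)

# Step 2: (1, 10) @ (10, 5) -> (1, 5), written two equivalent ways.
output = F_new @ W
output_alt = torch.matmul(F_new, W)

print(output.shape)  # torch.Size([1, 5])
```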

Step 3

Which method returns a tensor with an explicitly specified shape, allowing you to flatten a tensor to $(50,)$? (For this step, assume $F$ has shape $(5, 10)$ and needs to be flattened.)

Solution:
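Use the view or reshape methods. Passing -1 for one dimension lets PyTorch infer its size from the number of elements, which is a convenient way to flatten. Note that view never copies data and requires a compatible memory layout (for example, a contiguous tensor), while reshape falls back to copying when a view is not possible.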
Code: F_flat = F.view(-1) or F_flat = F.reshape(50)
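The sketch below tries both methods on a $(5, 10)$ tensor and then on its transpose, which is non-contiguous in memory. The values and variable names are illustrative.

```python
import torch

# A (5, 10) tensor with placeholder values 0..49.
F = torch.arange(50, dtype=torch.float32).view(5, 10)

# Flatten with view (-1 infers the size) and with reshape.
F_flat = F.view(-1)
F_flat_reshape = F.reshape(50)
print(F_flat.shape)  # torch.Size([50])

# Transposing yields a non-contiguous tensor of shape (10, 5).
F_t = F.t()

# view cannot flatten it without a copy, so it raises RuntimeError.
try:
    F_t.view(-1)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape copies the data when needed, so it succeeds.
F_t_flat = F_t.reshape(-1)
print(view_failed, F_t_flat.shape)
```

When the memory layout is uncertain, reshape is the safer default; view is useful when you want a guarantee that no copy is made.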